1 Maintenance

1.1 Restart R session in Rstudio

Use keyboard shortcut Ctrl + Shift + F10

1.2 Load the libraries

set.seed(1234)

if (!require("tidyverse")) install.packages("tidyverse"); library("tidyverse")
if (!require("sf")) install.packages("sf"); library("sf")
if (!require("tmap")) install.packages("tmap"); library("tmap")
if (!require("spdep")) install.packages("spdep"); library("spdep")
if (!require("sjmisc")) install.packages("sjmisc"); library("sjmisc")

tmap_mode("view")

1.3 Load the data

# prepared earlier
SA2_SEIFA <- readRDS("data/SA2_SEIFA.Rds")

2 From sf to sp

So far we worked with spatial objects represented by simple features from sf library. Long before sf library appeared R used different representation of spatial objects provided by sp library. Many packages might still depend on it so we will learn how to transform the data between then.

One notable example is spdep package that offers very powerful se tof tools to analyse spatial patterns of data.

We will reproject our SA2 & SEIFA data prepared before and convert it to sp object using as function:

SA2_SEIFA_proj <- SA2_SEIFA %>% 
  st_transform(3112) %>% 
  filter(!is.na(IRSAD_s)) %>% 
  as("Spatial")

Our object is now of a different class:

class(SA2_SEIFA_proj)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

Because of that it has different properties and will behave differently. For instance. We cannot simply view the data associated with polygons:

View(SA2_SEIFA_proj)

Instead of that we have to access so called data “slot” by using @ notation:

View(SA2_SEIFA_proj@data)

Luckily, tmap library can work with both sf and sp objects so all we have learned so far can be still used. Try creating a quick map (qtm function) with SA2_SEIFA_proj.

3 Spatial neighbours

In order to examine relationship between regions we have to define how we would like to define who is included as a neighbour. One common approach is to define “queen” type of neighbouring (queen being relation to the queen’s movement in chess). We can create neighbours list from polygon list with poly2nb function:

(SA2_weights_queen <- poly2nb(SA2_SEIFA_proj, queen = TRUE))
## Neighbour list object:
## Number of regions: 305 
## Number of nonzero links: 1764 
## Percentage nonzero weights: 1.896264 
## Average number of links: 5.783607

We can now turn this object into spatial weights (listw class) that are required by some tools:

(SA2_weights_queen_list <- nb2listw(SA2_weights_queen, style="W", zero.policy=TRUE))
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 305 
## Number of nonzero links: 1764 
## Percentage nonzero weights: 1.896264 
## Average number of links: 5.783607 
## 
## Weights style: W 
## Weights constants summary:
##     n    nn  S0       S1       S2
## W 305 93025 305 111.2742 1255.536

Picture might give it better overview than description - let’s plot the object to see how polygons are connected:

plot(SA2_SEIFA_proj, border = "grey60")
plot(SA2_weights_queen, coordinates(SA2_SEIFA_proj), pch = 19, cex = 0.6, add = TRUE)

We now defined all polygons that touch any part of their boundary to become neighbours. We will now work with statistics that use this neighbours to determine degree of global and local clustering.

4 Global spatial clustering

In order to check if our data is clustered in space we could use Moran’s I test for spatial autocorrelation. moran.test function from spdep package will perform that:

(SA2_IRSAD_s_Moran <- moran.test(SA2_SEIFA_proj$IRSAD_s, listw = SA2_weights_queen_list))
## 
##  Moran I test under randomisation
## 
## data:  SA2_SEIFA_proj$IRSAD_s  
## weights: SA2_weights_queen_list    
## 
## Moran I statistic standard deviate = 17.232, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##       0.586990113      -0.003289474       0.001173436

Interrpretation of the statistic is as follow:

Values of I usually range from −1 to +1. Values significantly below -1/(N-1) indicate negative spatial autocorrelation and values significantly above -1/(N-1) indicate positive spatial autocorrelation.

The significance of the statistic can be obtained from the Monte Carlo permutation test:

(SA2_IRSAD_s_Moran_MC <- moran.mc(SA2_SEIFA_proj$IRSAD_s,
                                 SA2_weights_queen_list,
                                 nsim = 999))
## 
##  Monte-Carlo simulation of Moran I
## 
## data:  SA2_SEIFA_proj$IRSAD_s 
## weights: SA2_weights_queen_list  
## number of simulations + 1: 1000 
## 
## statistic = 0.58699, observed rank = 1000, p-value = 0.001
## alternative hypothesis: greater

We can also get some insights into spatial patterning of the data by using Moran scatterplot, which plots data against its spatially lagged values,

moran.plot(SA2_SEIFA_proj$IRSAD_s, listw = nb2listw(SA2_weights_queen),
           xlab = "ISRAD score", ylab = "Lag of ISRAD score")

Data points are divided into four “quadrants” depending on where they lie in respect to their values to lagged values relation. For instance top-right quadrant contains higher values of variable surrounded also by neighboring areas with higher values as well. That could point to spatial clustering. Bottom-right quadrant on the other hand points to high values of variable surrounded by areas with lower values - indication for spatial outliers. We will use these quadrants and visualize them on maps.

5 Local spatial clustering

Moran’s I tells us if our data are globally clustered without indicating where exactly it is happening. We can learn about location of clusters and outliers by using local indicators of spatial association (LISA).

5.1 Data preparation

We first prepare our data by creating a standardized version of it using scale function.

SA2_SEIFA_proj$IRSAD_s_std <- scale(SA2_SEIFA_proj$IRSAD_s) 

We also prepare spatially lagged variable using spatial weights we created above.

SA2_SEIFA_proj$IRSAD_s_lag <- lag.listw(SA2_weights_queen_list, SA2_SEIFA_proj$IRSAD_s_std)

We can create a variable with four classes depending on which quadrant of the Moran’s plot the region lies on the scaterplot of standardized versus lagged values. Four possible conditions are:

SA2_SEIFA_proj$quad <- NA
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std > 0 & SA2_SEIFA_proj$IRSAD_s_lag > 0, "quad"] <- 1
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std < 0 & SA2_SEIFA_proj$IRSAD_s_lag < 0, "quad"] <- 2
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std > 0 & SA2_SEIFA_proj$IRSAD_s_lag < 0, "quad"] <- 3
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std < 0 & SA2_SEIFA_proj$IRSAD_s_lag > 0, "quad"] <- 4

Now we are ready to map our results. Each region belongs to one quadrant:

frq(SA2_SEIFA_proj$quad)
## 
## # x <numeric> 
## # total N=305  valid N=305  mean=1.93  sd=0.97
##  
##   val frq raw.prc valid.prc cum.prc
##     1 125   40.98     40.98   40.98
##     2 106   34.75     34.75   75.74
##     3  45   14.75     14.75   90.49
##     4  29    9.51      9.51  100.00
##  <NA>   0    0.00        NA      NA

We can map it into four categories giving it descriptive names and colours matching clusters (reds) and outliers (blues):

tm_shape(SA2_SEIFA_proj) +
  tm_fill(col = "quad", alpha = 0.5, style = "cat",
          palette = c("red", "blue", "lightpink", "skyblue2"),
          labels = c("High-High", "Low-Low", "High-Low", "Low-High"),
          title = "Local Moran's I")

5.2 LISA

LISA analysis can be performed using localmoran function from spdep package. We use variable of interest (IRSAD_s score) and listw object with spatial weights of neighbours:

SA2_IRSAD_s_lm <- localmoran(SA2_SEIFA_proj$IRSAD_s, listw = SA2_weights_queen_list)

We obtain an object of localmoran class that stores five results:

summary(SA2_IRSAD_s_lm)
##        Ii                 E.Ii               Var.Ii       
##  Min.   :-1.089355   Min.   :-0.003289   Min.   :0.07981  
##  1st Qu.: 0.007243   1st Qu.:-0.003289   1st Qu.:0.13914  
##  Median : 0.251212   Median :-0.003289   Median :0.16287  
##  Mean   : 0.586990   Mean   :-0.003289   Mean   :0.18720  
##  3rd Qu.: 0.958551   3rd Qu.:-0.003289   3rd Qu.:0.19610  
##  Max.   : 4.996230   Max.   :-0.003289   Max.   :0.49514  
##       Z.Ii            Pr(z > 0)      
##  Min.   :-2.45254   Min.   :0.00000  
##  1st Qu.: 0.02124   1st Qu.:0.01074  
##  Median : 0.60875   Median :0.27134  
##  Mean   : 1.44160   Mean   :0.28056  
##  3rd Qu.: 2.29933   3rd Qu.:0.49153  
##  Max.   :15.24860   Max.   :0.99291

The values here indicate:

Ii - local moran statistic E.Ii - expectation of local moran statistic Var.Ii - variance of local moran statistic Z.Ii - standard deviate of local moran statistic Pr() - p-value of local moran statistic

We can now extend quadrant assignment by additionally marking results significant depending on p-value obtained from the localmoran test (stored in the fifth column of results).

SA2_SEIFA_proj$quad_sig <- SA2_SEIFA_proj$quad
SA2_SEIFA_proj@data[SA2_IRSAD_s_lm[, 5] >= 0.05, "quad_sig"] <- 5  

Taking itnto account significance of results, we can add one more category to the results for regions that are not significantly different according to permutation test:

frq(SA2_SEIFA_proj$quad_sig)
## 
## # x <numeric> 
## # total N=305  valid N=305  mean=3.87  sd=1.67
##  
##   val frq raw.prc valid.prc cum.prc
##     1  51   16.72     16.72   16.72
##     2  47   15.41     15.41   32.13
##     5 207   67.87     67.87  100.00
##  <NA>   0    0.00        NA      NA

Since we only have first and second category, we modify our palette and labels arguments and also remove colour for insignificat results:

tm_shape(SA2_SEIFA_proj) +
  tm_fill(col = "quad_sig", alpha = 0.5, style = "cat",
          # palette = c("red", "blue", "lightpink", "skyblue2", "white"), 
          # labels = c("High-High", "Low-Low", "High-Low", "Low-High", "Not Signif."), 
          palette = c("red", "blue", NA), 
          labels = c("High-High", "Low-Low", "Not Signif."), 
          title = "Local Moran's I")

How does that map compare to original map of IRSAD score and deciles?

6 Further topics

  1. Construct neighbours using argument queen = FALSE. What differences can you see. In what types of situations would you expect it boe the biggest issue?

  2. Examine global and local autocorrelation of other SEIFA index.